Trading Off Memory For Parallelism Quality
Abstract
We detail an algorithm implemented in the R-Stream compiler that performs controlled array expansion and conversion to partial single-assignment form. It consists of (1) allowing our automatic code optimizer to selectively ignore false dependences in order to extract a good tradeoff between locality and parallelism, (2) detecting exactly all the causes of semantics violations in the relaxed schedule of the program, and (3) incrementally correcting those violations with minimal amounts of renaming and expansion. In particular, our algorithm may ignore all false dependences and extract the maximal available parallelism in the program given a limit on the amount of expansion. Memory consumption then spans a spectrum from no expansion to total single assignment, with many steps between those extremes. The exposed parallelism can be incrementally reduced to more tightly fit the number and organization of processing elements available in the targeted hardware and, by the same token, to reduce the program's memory footprint. We extend our correction scheme into an iterative algorithm that tailors the mapping of the program for a good tradeoff between parallelism, locality, and memory consumption. We demonstrate the power of our technique by optimizing a radar benchmark comprising a sequence of BLAS calls. By applying our technique and optimizing at a global level, we achieve significant performance improvements over an implementation based on vendor-optimized math library calls. Our technique also has implications for algorithm selection.
Similar papers
Latency-Tolerant Software Distributed Shared Memory
We present Grappa, a modern take on software distributed shared memory (DSM) for in-memory data-intensive applications. Grappa enables users to program a cluster as if it were a single, large, non-uniform memory access (NUMA) machine. Performance scales up even for applications that have poor locality and input-dependent load distribution. Grappa addresses deficiencies of previous DSM systems b...
This paper proposes an intelligent trading system using support vector regression optimized by genetic algorithms (SVR-GA) and a multilayer perceptron optimized with GA (MLP-GA). Experimental results show that both approaches outperform conventional trading systems without prediction and a recent fuzzy trading system in terms of final equity and maximum drawdown for the Hong Kong Hang Seng stock index.
An ILP-based DMA Data Transmission Optimization Algorithm for MPSoC
With the rapid development of integrated circuit design technology and the growing volume of tasks and data to be processed, MPSoC is becoming increasingly popular in a variety of applications. In MPSoC design, parallelism is a very important issue, for example how to realize task parallelism and data parallelism. Focusing on this issue, this paper analyzes the role of DMA and presents an ILP-Based DM...
Trading off Parallelism and Numerical Stability
The fastest parallel algorithm for a problem may be significantly less stable numerically than the fastest serial algorithm. We illustrate this phenomenon by a series of examples drawn from numerical linear algebra. We also show how some of these instabilities may be mitigated by better floating point arithmetic.
Combining Local and Global History for High Performance Data Prefetching
In this paper, we present our design for a high performance prefetcher, which exploits various localities in both local cache-miss streams (misses generated from the same instruction) and the global cache-miss address stream (the misses from different instructions). Besides the stride and context localities that have been exploited in previous work, we identify new data localities and incorpora...
Journal:
Volume/Issue:
Pages: -
Publication date: 2011